185 research outputs found
Asymptotics for high-dimensional covariance matrices and quadratic forms with applications to the trace functional and shrinkage
We establish large sample approximations for an arbitrary number of bilinear
forms of the sample variance-covariance matrix of a high-dimensional vector
time series using $\ell_1$-bounded and small $\ell_2$-bounded weighting
vectors. Estimation of the asymptotic covariance structure is also discussed.
The results hold true without any constraint on the dimension, the number of
forms and the sample size or their ratios. Concrete and potential applications
are widespread and cover high-dimensional data science problems such as tests
for large numbers of covariances, sparse portfolio optimization and projections
onto sparse principal components or more general spanning sets as frequently
considered, e.g. in classification and dictionary learning. As two specific
applications of our results, we study in greater detail the asymptotics of the
trace functional and shrinkage estimation of covariance matrices. In shrinkage
estimation, it turns out that the asymptotics differs for weighting vectors
bounded away from orthogonality and nearly orthogonal ones in the sense that
their inner product converges to 0. Comment: 42 pages
Time-frequency analysis of locally stationary Hawkes processes
Locally stationary Hawkes processes have been introduced in order to
generalise classical Hawkes processes away from stationarity by allowing for a
time-varying second-order structure. This class of self-exciting point
processes has recently attracted a lot of interest in applications in the life
sciences (seismology, genomics, neuroscience, ...), but also in the modelling
of high-frequency financial data. In this contribution we provide a fully
developed nonparametric estimation theory of both local mean density and local
Bartlett spectra of a locally stationary Hawkes process. In particular we apply
our kernel estimation of the spectrum localised both in time and frequency to
two data sets of transaction times revealing pertinent features in the data
that had not been made visible by classical non-localised approaches based on
models with constant fertility functions over time. Comment: Bernoulli journal, to appear
Locally stationary long memory estimation
There exists a wide literature on modelling strongly dependent time series
using a long-memory parameter d, including more recent work on semiparametric
wavelet estimation. As a generalization of these latter approaches, in this
work we allow the long-memory parameter d to be varying over time. We embed our
approach into the framework of locally stationary processes. We show weak
consistency and a central limit theorem for our log-regression wavelet
estimator of the time-dependent d in a Gaussian context. Both simulations and a
real data example complete our work on providing a fairly general approach.
A Multiscale Approach for Statistical Characterization of Functional Images
Increasingly, scientific studies yield functional image data, in which the observed data consist of sets of curves recorded on the pixels of the image. Examples include temporal brain response intensities measured by fMRI and NMR frequency spectra measured at each pixel. This article presents a new methodology for improving the characterization of pixels in functional imaging, formulated as a spatial curve clustering problem. Our method operates on curves as a unit. It is nonparametric and involves multiple stages: (i) wavelet thresholding, aggregation, and Neyman truncation to effectively reduce dimensionality; (ii) clustering based on an extended EM algorithm; and (iii) multiscale penalized dyadic partitioning to create a spatial segmentation. We motivate the different stages with theoretical considerations and arguments, and illustrate the overall procedure on simulated and real datasets. Our method appears to offer substantial improvements over monoscale pixel-wise methods. An appendix giving some theoretical justifications of the methodology, together with computer code, documentation and the dataset, is available in the online supplements.
Intrinsic data depth for Hermitian positive definite matrices
Nondegenerate covariance, correlation and spectral density matrices are
necessarily symmetric or Hermitian and positive definite. The main contribution
of this paper is the development of statistical data depths for collections of
Hermitian positive definite matrices by exploiting the geometric structure of
the space as a Riemannian manifold. The depth functions allow one to naturally
characterize most central or outlying matrices, but also provide a practical
framework for inference in the context of samples of positive definite
matrices. First, the desired properties of an intrinsic data depth function
acting on the space of Hermitian positive definite matrices are presented.
Second, we propose two computationally fast pointwise and integrated data depth
functions that satisfy each of these requirements and investigate several
robustness and efficiency aspects. As an application, we construct depth-based
confidence regions for the intrinsic mean of a sample of positive definite
matrices, which is applied to the exploratory analysis of a collection of
covariance matrices associated with a multicenter research trial.
Fitting dynamic factor models to non-stationary time series
Factor modelling of a large time series panel has widely proven useful to reduce its cross-sectional dimensionality. This is done by explaining common co-movements in the panel through the existence of a small number of common components, up to some idiosyncratic behaviour of each individual series. To capture serial correlation in the common components, a dynamic structure is used as in traditional (uni- or multivariate) time series analysis of second order structure, i.e. allowing for infinite-length filtering of the factors via dynamic loadings. In this paper, motivated by economic data observed over long time periods which show smooth transitions over time in their covariance structure, we allow the dynamic structure of the factor model to be non-stationary over time, by proposing a deterministic time variation of its loadings. In this respect we generalise existing recent work on static factor models with time-varying loadings as well as the classical, i.e. stationary, dynamic approximate factor model. Motivated by the stationary case, we estimate the common components of our dynamic factor model by the eigenvectors of a consistent estimator of the now time-varying spectral density matrix of the underlying data-generating process. This can be seen as a time-varying principal components approach in the frequency domain. We derive consistency of this estimator in a "double-asymptotic" framework of both cross-section and time dimension tending to infinity. A simulation study illustrates the performance of our estimators. Keywords: econometrics
Multivariate wavelet-based shape preserving estimation for dependent observations
We present a new approach to shape preserving estimation of probability distribution and density functions using wavelet methodology for multivariate dependent data. Our estimators preserve shape constraints such as monotonicity, positivity and integration to one, and allow for low spatial regularity of the underlying functions. As an important application, we discuss conditional quantile estimation for financial time series data. We show through Monte Carlo simulations that our methodology can be easily implemented with B-splines and performs well in a finite sample situation. Keywords: conditional quantile; time series; shape preserving wavelet estimation; B-splines; multivariate process
Structural shrinkage of nonparametric spectral estimators for multivariate time series
In this paper we investigate the performance of periodogram based estimators
of the spectral density matrix of possibly high-dimensional time series. We
suggest and study shrinkage as a remedy against numerical instabilities due to
deteriorating condition numbers of (kernel) smoothed periodogram matrices.
Moreover, shrinking the empirical eigenvalues in the frequency domain towards
one another also improves at the same time the Mean Squared Error (MSE) of
these widely used nonparametric spectral estimators. Compared to some existing
time domain approaches, restricted to i.i.d. data, in the frequency domain it
is necessary to take the size of the smoothing span as "effective or local
sample size" into account. While Böhm and von Sachs (2007) propose a
multiple of the identity matrix as optimal shrinkage target in the absence of
knowledge about the multidimensional structure of the data, here we consider
"structural" shrinkage. We assume that the spectral structure of the data is
induced by underlying factors. However, in contrast to actual factor modelling
suffering from the need to choose the number of factors, we suggest a
model-free approach. Our final estimator is the asymptotically MSE-optimal
linear combination of the smoothed periodogram and the parametric estimator
based on an underfitting (and hence deliberately misspecified) factor model. We
complete our theoretical considerations by some extensive simulation studies.
In the situation of data generated from a higher-order factor model, we compare
all four types of involved estimators (including the one of Böhm and von
Sachs (2007)). Comment: Published at http://dx.doi.org/10.1214/08-EJS236 in the Electronic
Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of
Mathematical Statistics (http://www.imstat.org)